
16 research outputs found

Avoiding core's DUE & SDC via acoustic wave detectors and tailored error containment and recovery

Author: González Colás Antonio María
Upasani Gaurang
Vera Rivera Francisco Javier
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

The trend of downsizing transistors and scaling operating voltage has made processor chips more sensitive to radiation phenomena, making soft errors an important challenge. New reliability techniques for handling soft errors in the logic and memories that allow meeting the desired failures-in-time (FIT) target are key to keep harnessing the benefits of Moore's law. The failure to scale the soft error rate caused by particle strikes may soon limit the total number of cores that can run at the same time. This paper proposes a lightweight and scalable architecture to eliminate silent data corruption (SDC) errors and detected unrecoverable errors (DUE) of a core. The architecture uses acoustic wave detectors for error detection. We propose to recover by confining the errors in the cache hierarchy, allowing us to deal with the relatively long detection latencies. Our results show that the proposed mechanism protects the whole core (logic, latches and memory arrays) incurring performance overhead as low as 0.60%. © 2014 IEEE. Peer Reviewed. Postprint (author's final draft)

UPCommons. Portal del coneixement obert de la UPC

Fifty Years of ISCA: A data-driven retrospective on key trends

Author: Jain Rutwik
Parthasarathy Nidhi
Patterson David
Ranganathan Parthasarathy
Sampson Adrian
Shah Shaan
Sinclair Matthew D.
Upasani Gaurang
Publication date: 08/06/2023

Computer Architecture, broadly, involves optimizing hardware and software for current and future processing systems. Although there are several other top venues to publish Computer Architecture research, including ASPLOS, HPCA, and MICRO, ISCA (the International Symposium on Computer Architecture) is one of the oldest, longest running, and most prestigious venues for publishing Computer Architecture research. Since 1973, except for 1975, ISCA has been organized annually. Accordingly, this year will be the 50th year of ISCA. Thus, we set out to analyze the past 50 years of ISCA to understand who and what has been driving and innovating computing systems thus far. Our analysis identifies several interesting trends that reflect how ISCA, and Computer Architecture in general, has grown and evolved in the past 50 years, including minicomputers, general-purpose uniprocessor CPUs, multiprocessor and multi-core CPUs, general-purpose GPUs, and accelerators. Comment: 17 pages, 11 figures

arXiv.org e-Print Archive

Soft error mitigation techniques for future chip multiprocessors

Author: Upasani Gaurang R
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/02/2016

The sustained drive to downsize transistors has reached a point where device sensitivity to transient faults caused by neutron and alpha particle strikes, also known as soft errors, has moved to the forefront of concerns for next-generation designs. Following Moore's law, the exponential growth in the number of transistors per chip has brought tremendous progress in the performance and functionality of processors. However, incorporating billions of transistors into a chip makes it more likely to encounter soft errors. Moreover, aggressive voltage scaling and process variations make processors even more vulnerable to soft errors. Also, the number of cores on chip is growing exponentially, fueling the multicore revolution. With increased core counts and larger memory arrays, the total failures-in-time (FIT) per chip (or package) increases. Our studies concluded that the shrinking technology required to match the power and performance demands for servers and future exa- and tera-scale systems impacts the FIT budget. New soft error mitigation techniques that allow meeting the failure rate target are important to keep harnessing the benefits of Moore's law. Traditionally, reliability research has focused on providing circuit, microarchitecture and architectural solutions, which include device hardening, redundant execution, lock-step, error correcting codes, modular redundancy, etc. In general, all these techniques are very effective in handling soft errors but expensive in terms of performance, power, and area overheads. Traditional solutions fail to scale in providing the required degree of reliability with increasing failure rates while maintaining low area, power and performance cost. Moreover, this family of solutions has hit the point of diminishing returns, and simply achieving a 2X improvement in the soft error rate may be impractical.
Instead of relying on some kind of redundancy, a new direction that is attracting growing interest in the research community is detecting the actual particle strike rather than its consequence. The proposed idea consists of deploying a set of detectors on silicon that would be in charge of perceiving the particle strikes that can potentially create a soft error. Upon detection, a hardware or software mechanism would trigger the appropriate recovery action. This work proposes a lightweight and scalable soft error mitigation solution. As a part of our soft error mitigation technique, we show how to use acoustic wave detectors for detecting and locating particle strikes. We use them to protect both the logic and the memory arrays, acting as a unified error detection mechanism. We architect an error containment mechanism and a unique recovery mechanism based on checkpointing that works with acoustic wave detectors to effectively recover from soft errors. Our results show that the proposed mechanism protects the whole processor (logic, flip-flops, latches and memory arrays) incurring minimum overheads.
Nanotechnology has continued to advance over recent decades at the pace set by Moore's law, which says that transistors shrink in size by 50% every two years. This reduction in size has allowed transistors to become ever faster and to consume less energy. However, this technological progress now faces the problem of the vulnerability of these small transistors, above all to particle strikes (soft errors). Moreover, the way these transistors are used today makes them even more vulnerable to possible particle strikes. The voltage reduction used in current processors, the growing number of processors in current devices, and variations in the manufacturing process all make it more likely that particles striking the transistors cause errors.
Our studies conclude that the technology needed to build future terascale and exascale supercomputers will be highly susceptible to particle strikes, and that new techniques for detecting and correcting the errors they cause will be essential. The solutions in use today, based on modifying circuits and processor design, cannot be used in future terascale and exascale supercomputers at a reasonable cost. A new class of solution under investigation is to detect the particle strikes themselves, an approach entirely opposite to earlier directions based on detecting the errors that the strikes caused. Our solution consists of placing a set of detectors on the silicon that would detect all particle strikes that could potentially cause errors. Once a strike is detected, we would apply, if necessary, solutions to recover from the error it may have caused. In our work we focus on acoustic sensors. The thesis proposes mechanisms that allow us to detect and locate particle strikes based on these acoustic sensors. We demonstrate how they can be used to protect processors, both logic and memory. We also propose a solution that allows us to contain and recover from the errors caused by particle strikes once they are detected by our sensors. The results show that the cost of protecting future terascale and exascale supercomputers is reasonable and sufficient. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Soft error mitigation techniques for future chip multiprocessors

Author: Upasani Gaurang R
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2016

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Tesis Doctorals en Xarxa

Soft error mitigation techniques for future chip multiprocessors

Author: Upasani Gaurang R
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/02/2016

A case for acoustic wave detectors for soft-errors

Author: González Colás Antonio María
Upasani Gaurang
Vera Rivera Francisco Javier
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

The continuing decrease in dimensions and operating voltage of transistors has increased their sensitivity to radiation phenomena, making soft errors an important challenge in future microprocessors. New techniques for detecting errors in the logic and memories that allow meeting the desired failure rate are key to keep harnessing the benefits of Moore's law. This paper proposes a low-cost dynamic particle strike detection mechanism based on acoustic wave detectors. Our results show that the proposed mechanism can protect the whole chip, including both the logic and the memory arrays, and detect all the soft errors caused by particle strikes with minimal hardware overhead and performance cost. Peer Reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

PLANTX The universal tool for digital and analogue transmitter planning

Author: González Colás Antonio María
Upasani Gaurang
Vera Rivera Francisco Javier
Publication date: 01/01/1997

SIGLE. Available from British Library Document Supply Centre, DSC:1871.36013(1996/13) / BLDSC - British Library Document Supply Centre. GB. United Kingdom

UPCommons. Portal del coneixement obert de la UPC

OpenGrey Repository

Fire safety Firecode compliance

Author: González Colás Antonio María
Upasani Gaurang
Vera Rivera Francisco Javier
Publication date: 01/01/2002

SIGLE. Available from British Library Document Supply Centre, DSC:9294.517(2002/54) / BLDSC - British Library Document Supply Centre. GB. United Kingdom

UPCommons. Portal del coneixement obert de la UPC

OpenGrey Repository

Pre- and post-ultimate behaviour analysis and derivation of strength model of rectangular box girder

Author: González Colás Antonio María
Upasani Gaurang
Vera Rivera Francisco Javier
Publication date: 01/01/1987

SIGLE. Available from British Library Document Supply Centre, DSC:9110.2555(NAOE--87-27) / BLDSC - British Library Document Supply Centre. GB. United Kingdom

UPCommons. Portal del coneixement obert de la UPC

OpenGrey Repository

Setting an error detection infrastructure with low cost acoustics wave detectors

Author: González Colás Antonio María
Upasani Gaurang
Vera Rivera Francisco Javier
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

The continuing decrease in dimensions and operating voltage of transistors has increased their sensitivity to radiation phenomena, making soft errors an important challenge in future chip multiprocessors (CMPs). Hence, new techniques for detecting errors in the logic and memories that allow meeting the desired failures-in-time (FIT) budget in CMPs are required. This paper proposes a low-cost dynamic particle strike detection mechanism through acoustic wave detectors. Our results show that our mechanism can protect both the logic and the memory arrays. As a case study, we also show how this technique can be combined with error codes to protect the last-level cache at low cost. Peer Reviewed

RECERCAT